Recurrent Convolutional Neural Network


Speech Separation Using an Asynchronous Fully Recurrent Convolutional Neural Network

Neural Information Processing Systems

Recent advances in the design of neural network architectures, in particular those specialized in modeling sequences, have provided significant improvements in speech separation performance. In this work, we propose to use a bio-inspired architecture called Fully Recurrent Convolutional Neural Network (FRCNN) to solve the separation task. This model contains bottom-up, top-down and lateral connections to fuse information processed at various time scales, represented by stages. In contrast to the traditional approach of updating all stages in parallel, we propose to first update the stages one by one in the bottom-up direction, then fuse information from adjacent stages simultaneously, and finally fuse information from all stages into the bottom stage together. Experiments showed that this asynchronous updating scheme achieved significantly better results with far fewer parameters than the traditional synchronous updating scheme on speech separation. In addition, the proposed model achieved competitive or better results with high efficiency compared to other state-of-the-art approaches on two benchmark datasets.
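
The asynchronous order can be sketched in a few lines. The following PyTorch-style sketch is illustrative only, assuming 1-D convolutional stages, additive fusion, and nearest-neighbor resampling; the module names and channel counts are our assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class AsyncFRCNNBlock(nn.Module):
    # Illustrative sketch of the asynchronous stage-updating order (not the authors' code).
    def __init__(self, channels=64, num_stages=4):
        super().__init__()
        self.num_stages = num_stages
        self.lateral = nn.ModuleList([nn.Conv1d(channels, channels, 3, padding=1)
                                      for _ in range(num_stages)])
        self.down = nn.ModuleList([nn.Conv1d(channels, channels, 3, stride=2, padding=1)
                                   for _ in range(num_stages - 1)])
        self.fuse_all = nn.Conv1d(channels * num_stages, channels, 1)

    def forward(self, x):                      # x: (B, channels, T)
        # 1) Bottom-up: update stages one by one, each fed by the stage below.
        stages = [self.lateral[0](x)]
        for i in range(1, self.num_stages):
            stages.append(self.lateral[i](self.down[i - 1](stages[-1])))
        # 2) Fuse adjacent stages simultaneously (only the top-down neighbor here, for brevity).
        fused = []
        for i, s in enumerate(stages):
            if i + 1 < self.num_stages:
                s = s + nn.functional.interpolate(stages[i + 1], size=s.shape[-1])
            fused.append(s)
        # 3) Fuse all stages together into the bottom (finest time-scale) stage.
        T = fused[0].shape[-1]
        return self.fuse_all(torch.cat(
            [nn.functional.interpolate(f, size=T) for f in fused], dim=1))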


All Eyes, no IMU: Learning Flight Attitude from Vision Alone

Hagenaars, Jesse J., Stroobants, Stein, Bohte, Sander M., De Croon, Guido C. H. E.

arXiv.org Artificial Intelligence

Vision is an essential part of attitude control for many flying animals, some of which have no dedicated sense of gravity. Flying robots, on the other hand, typically depend heavily on accelerometers and gyroscopes for attitude stabilization. In this work, we present the first vision-only approach to flight control for use in generic environments. We show that a quadrotor drone equipped with a downward-facing event camera can estimate its attitude and rotation rate from just the event stream, enabling flight control without inertial sensors. Our approach uses a small recurrent convolutional neural network trained through supervised learning. Real-world flight tests demonstrate that our combination of event camera and low-latency neural network is capable of replacing the inertial measurement unit in a traditional flight control loop. Furthermore, we investigate the network's generalization across different environments, and the impact of memory and different fields of view. While networks with memory and access to horizon-like visual cues achieve the best performance, variants with a narrower field of view achieve better relative generalization. Our work showcases vision-only flight control as a promising candidate for enabling autonomous, insect-scale flying robots.
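
As a rough illustration of such a network, the sketch below maps a sequence of two-channel event frames to per-step attitude and rate estimates through a small convolutional encoder and a GRU; all layer sizes and the four-dimensional output are assumptions, not the paper's architecture.

import torch
import torch.nn as nn

class EventAttitudeNet(nn.Module):
    # Illustrative recurrent CNN for attitude estimation from event frames (assumed shapes).
    def __init__(self, hidden=64):
        super().__init__()
        self.encoder = nn.Sequential(                      # 2 polarity channels per event frame
            nn.Conv2d(2, 16, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten())
        self.gru = nn.GRU(32 * 4 * 4, hidden, batch_first=True)  # memory over the event stream
        self.head = nn.Linear(hidden, 4)                   # e.g. roll, pitch and two rates

    def forward(self, frames):                             # frames: (B, T, 2, H, W)
        B, T = frames.shape[:2]
        feats = self.encoder(frames.flatten(0, 1)).view(B, T, -1)
        out, _ = self.gru(feats)
        return self.head(out)                              # per-step estimates: (B, T, 4)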


Brief Review -- RCNN: Recurrent Convolutional Neural Network for Object Recognition

#artificialintelligence

A similar idea is used in the PolyInception modules of PolyNet, which was the 2nd runner-up in the ILSVRC 2016 image classification task.


LIMSI_UPV at SemEval-2020 Task 9: Recurrent Convolutional Neural Network for Code-mixed Sentiment Analysis

Banerjee, Somnath, Ghannay, Sahar, Rosset, Sophie, Vilnat, Anne, Rosso, Paolo

arXiv.org Artificial Intelligence

This paper describes the participation of the LIMSI UPV team in SemEval-2020 Task 9: Sentiment Analysis for Code-Mixed Social Media Text. The proposed approach competed in the SentiMix Hindi-English subtask, which addresses the problem of predicting the sentiment of a given Hindi-English code-mixed tweet. We propose a recurrent convolutional neural network that combines a recurrent neural network and a convolutional network to better capture the semantics of the text for code-mixed sentiment analysis. The proposed system obtained an F1 score of 0.69 (best run) on the given test data and achieved 9th place (Codalab username: somban) in the SentiMix Hindi-English subtask.
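
A minimal sketch of this kind of recurrent convolutional text classifier is given below, with a bidirectional LSTM producing contextual features and a convolution extracting local n-gram features before max-pooling; the vocabulary size, dimensions, and three-class head are assumptions, not the submitted system.

import torch
import torch.nn as nn

class RCNNSentiment(nn.Module):
    # Illustrative recurrent convolutional text classifier (not the team's system).
    def __init__(self, vocab_size=30000, emb=100, hidden=128, classes=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb, padding_idx=0)
        self.rnn = nn.LSTM(emb, hidden, batch_first=True, bidirectional=True)
        self.conv = nn.Conv1d(2 * hidden, hidden, kernel_size=3, padding=1)
        self.cls = nn.Linear(hidden, classes)          # positive / negative / neutral

    def forward(self, tokens):                         # tokens: (B, T) integer ids
        h, _ = self.rnn(self.embed(tokens))            # contextual features: (B, T, 2*hidden)
        c = torch.relu(self.conv(h.transpose(1, 2)))   # local n-gram features
        return self.cls(c.max(dim=-1).values)          # max-pool over time, then classify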


Recurrent Convolutional Neural Networks help to predict location of Earthquakes

Kail, Roman, Zaytsev, Alexey, Burnaev, Evgeny

arXiv.org Machine Learning

We examine the applicability of modern neural network architectures to the mid-term prediction of earthquakes. Our data-based classification model aims to predict whether an earthquake with a magnitude above a threshold will take place in a given area of size $10 \times 10$ kilometers within $10$-$60$ days of a given moment. Our deep neural network model has a recurrent part (LSTM) that accounts for time dependencies between earthquakes and a convolutional part that accounts for spatial dependencies. The results show that neural-network-based models beat feature-based baseline models that also account for spatio-temporal dependencies between different earthquakes. On historical data of earthquakes in Japan, our model predicts the occurrence of an earthquake with magnitude $M_c > 5$ within $10$ to $60$ days of a given moment with quality metrics ROC AUC $0.975$ and PR AUC $0.0890$, making $1.18 \cdot 10^3$ correct predictions while missing $2.09 \cdot 10^3$ earthquakes and raising $192 \cdot 10^3$ false alarms. The baseline approach has a similar ROC AUC of $0.992$, a similar number of correct predictions ($1.19 \cdot 10^3$), and misses $2.07 \cdot 10^3$ earthquakes, but has a significantly worse PR AUC of $0.00911$ and far more false alarms ($1004 \cdot 10^3$).
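
A hedged sketch of such a model: a convolutional encoder summarizes each time step's spatial activity map, and an LSTM accumulates the temporal dependencies before a binary head. The input layout, layer sizes, and single-probability output are our assumptions, not the paper's architecture.

import torch
import torch.nn as nn

class QuakeNet(nn.Module):
    # Illustrative CNN + LSTM over gridded seismicity histories (assumed shapes).
    def __init__(self, hidden=64):
        super().__init__()
        self.spatial = nn.Sequential(              # one activity map per time step
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.temporal = nn.LSTM(32, hidden, batch_first=True)  # time dependencies
        self.head = nn.Linear(hidden, 1)

    def forward(self, maps):                       # maps: (B, T, 1, H, W)
        B, T = maps.shape[:2]
        feats = self.spatial(maps.flatten(0, 1)).view(B, T, -1)
        _, (h, _) = self.temporal(feats)
        return torch.sigmoid(self.head(h[-1]))     # P(quake above threshold in 10-60 days)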


Deep Learning Techniques for Text Classification

#artificialintelligence

Deep learning models have achieved state-of-the-art results across many domains. RMDL (Random Multimodel Deep Learning) addresses the problem of finding the best deep learning structure and architecture while simultaneously improving robustness and accuracy through ensembles of different deep learning architectures. RMDLs can accept a variety of data as input, including text, video, images, and symbols.
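
The core RMDL idea, training several randomly configured models and combining their predictions by majority vote, can be sketched as follows; the random MLP generator and vote function here are simplified illustrations, not the RMDL library's API.

import random
import torch
import torch.nn as nn

def random_mlp(in_dim, classes):
    # Randomly pick depth and width, as RMDL does for its DNN branch.
    layers, width = [], in_dim
    for _ in range(random.randint(1, 3)):
        nxt = random.choice([64, 128, 256])
        layers += [nn.Linear(width, nxt), nn.ReLU()]
        width = nxt
    layers.append(nn.Linear(width, classes))
    return nn.Sequential(*layers)

def majority_vote(models, x):
    # Each (trained) model votes with its argmax class; the mode of the votes wins.
    votes = torch.stack([m(x).argmax(dim=-1) for m in models])
    return votes.mode(dim=0).values

models = [random_mlp(in_dim=300, classes=4) for _ in range(5)]
preds = majority_vote(models, torch.randn(8, 300))  # class votes for 8 examples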


Visual Depth Mapping from Monocular Images using Recurrent Convolutional Neural Networks

Mern, John, Julian, Kyle, Tompa, Rachael E., Kochenderfer, Mykel J.

arXiv.org Artificial Intelligence

A reliable sense-and-avoid system is critical to enabling safe autonomous operation of unmanned aircraft. Existing sense-and-avoid methods often require specialized sensors that are too large or power-intensive for use on small unmanned vehicles. This paper presents a method to estimate object distances from visual image sequences, allowing low-cost, on-board monocular cameras to serve as simple collision avoidance sensors. We present a deep recurrent convolutional neural network and training method to generate depth maps from video sequences. Our network is trained using simulated camera and depth data generated with Microsoft's AirSim simulator. Empirically, we show that our model achieves superior performance compared to models generated using prior methods. We further demonstrate that the method can be used for sense-and-avoid of obstacles in simulation.
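
A minimal sketch of a recurrent convolutional encoder-decoder for this task is shown below: a convolutional recurrent state carries temporal context across frames while a decoder emits one depth map per frame. The simplified recurrent cell and all shapes are assumptions, not the paper's network.

import torch
import torch.nn as nn

class SimpleConvRNNCell(nn.Module):
    # Minimal convolutional recurrent cell: the hidden state keeps its spatial layout.
    def __init__(self, ch):
        super().__init__()
        self.mix = nn.Conv2d(2 * ch, ch, 3, padding=1)

    def forward(self, x, h):
        return torch.tanh(self.mix(torch.cat([x, h], dim=1)))

class DepthRCNN(nn.Module):
    # Illustrative recurrent encoder-decoder; H and W must be divisible by 4.
    def __init__(self, ch=32):
        super().__init__()
        self.ch = ch
        self.enc = nn.Sequential(
            nn.Conv2d(3, ch, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, stride=2, padding=1), nn.ReLU())
        self.cell = SimpleConvRNNCell(ch)
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(ch, ch, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(ch, 1, 4, stride=2, padding=1))

    def forward(self, video):                      # video: (B, T, 3, H, W)
        B, _, _, H, W = video.shape
        h = video.new_zeros(B, self.ch, H // 4, W // 4)
        depths = []
        for t in range(video.shape[1]):            # carry context frame by frame
            h = self.cell(self.enc(video[:, t]), h)
            depths.append(self.dec(h))             # one depth map per frame
        return torch.stack(depths, dim=1)          # (B, T, 1, H, W)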


AI can untangle the jumble of neurons packed in brain scans

#artificialintelligence

AI can help neurologists automatically map the connections between different neurons in brain scans, a tedious task that can otherwise take hundreds or thousands of hours. In a paper published in Nature Methods, AI researchers from Google collaborated with scientists from the Max Planck Institute of Neurobiology to inspect the brain of a zebra finch, a small Australian bird renowned for its singing. Although the contents of their craniums are small, zebra finches aren't birdbrains: their connectome is densely packed with neurons. To map the connections, scientists image slices of the brain using an electron microscope, which requires high resolution to make out all the different neurites, the thin projections extending from nerve cells.

